The Engineer's Alpha

Posts tagged with "Prompt Engineering"

Claude Code Skill Safety: From 'Please Stop' to 'You Can't Move'

38 Skills, three layers of defense, one hard lesson: natural language instructions are not a safety mechanism. How I systematically hardened 12 unprotected destructive Skills with PreToolUse Hooks, Skill splitting, and disallowed-tools.

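Claude Code Skill Safety: From "Please Stop" to "You Can't Even Move" (Chinese edition)

38 Skills, three layers of defense, and one hard-won lesson: natural-language instructions are not a safety mechanism. This post documents how I used PreToolUse Hooks, Skill splitting, and disallowed-tools to systematically patch 12 destructive Skills that had no checkpoints at all.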

Safety Gates in Claude Code Skills: From Auditing 35 Skills to a Three-Layer Protection Model

I assumed writing 'Use AskUserQuestion' in a Skill was a hard constraint. After auditing 35 Skills, reading the official docs, and digging through GitHub Issues, I found out: the model uses the same mechanism to decide whether to obey your CHECKPOINT and whether to invoke your tool. There's only one gate that's truly 100%.

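Safety Gates in Claude Code Skills: From an Audit of 35 Skills to a Three-Layer Protection Model (Chinese edition)

I assumed that writing 'Use AskUserQuestion' in a Skill was a hard constraint. After auditing 35 Skills and going through the official docs and GitHub Issues, I found that the model uses the same mechanism to decide whether to heed your CHECKPOINT and whether to invoke your tool. Only one gate is truly 100%.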